29 research outputs found

    Confidence driven TGV fusion

    We introduce a novel model for spatially varying variational data fusion, driven by point-wise confidence values. The proposed model allows for the joint estimation of the data and the confidence values based on the spatial coherence of the data. We discuss the main properties of the introduced model, as well as suitable algorithms for estimating the solution of the corresponding biconvex minimization problem, and their convergence. The performance of the proposed model is evaluated on the problem of depth image fusion, using both synthetic and real data from publicly available datasets.
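
    The alternating (biconvex) estimation described above can be sketched with a toy scalar example. This is a minimal illustration, not the paper's model: the TGV regularizer is replaced by a plain confidence-weighted data term, and all names are invented for the sketch.

```python
def fuse(observations, iters=20, sigma=0.5):
    """Toy confidence-driven fusion of several depth estimates for one pixel
    by alternating minimization (the biconvex structure: each sub-step is
    convex when the other variable is fixed). Illustrative only."""
    conf = [1.0] * len(observations)           # start fully confident
    u = sum(observations) / len(observations)  # initial fused estimate
    for _ in range(iters):
        # u-step: confidence-weighted average (closed form, quadratic data term)
        u = sum(c * d for c, d in zip(conf, observations)) / sum(conf)
        # c-step: down-weight observations with large residuals
        conf = [1.0 / (1.0 + ((d - u) / sigma) ** 2) for d in observations]
    return u, conf
```

    Inlier observations keep a confidence near one while an outlier is driven toward zero, so the fused value stays close to the consistent measurements.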

    Discovery and recognition of motion primitives in human activities

    We present a novel framework for the automatic discovery and recognition of motion primitives in videos of human activities. Given the 3D pose of a human in a video, human motion primitives are discovered by optimizing the `motion flux', a quantity which captures the motion variation of a group of skeletal joints. A normalization of the primitives is proposed in order to make them invariant with respect to a subject's anatomical variations and the data sampling rate. The discovered primitives are unknown and unlabeled, and are grouped into classes in an unsupervised manner via a hierarchical non-parametric Bayes mixture model. Once classes are determined and labeled, they are further analyzed to establish models for recognizing the discovered primitives. Each primitive model is defined by a set of learned parameters. Given new video data and the estimated pose of the subject appearing in the video, the motion is segmented into primitives, which are recognized with a probability computed from the parameters of the learned models. Using our framework we build a publicly available dataset of human motion primitives, using sequences taken from well-known motion capture datasets. We expect that our framework, by providing an objective way of discovering and categorizing human motion, will be a useful tool in numerous research fields, including video analysis, human-inspired motion generation, learning by demonstration, intuitive human-robot interaction, and human behavior analysis.
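
    The segmentation step can be sketched in a reduced form. Here `motion_flux` is only a hypothetical proxy (summed joint displacement per frame transition) for the quantity optimized in the paper, and the thresholding segmenter is a deliberately naive stand-in for the optimization.

```python
import math

def motion_flux(poses):
    """poses: list of frames, each a list of (x, y, z) joint positions.
    Returns per-transition flux: summed joint displacement magnitude
    (a toy proxy for the paper's motion-flux quantity)."""
    flux = []
    for prev, cur in zip(poses, poses[1:]):
        flux.append(sum(math.dist(p, q) for p, q in zip(prev, cur)))
    return flux

def segment_primitives(flux, threshold=0.1):
    """Return (start, end) frame-transition runs where flux stays above
    threshold -- candidate motion primitives separated by near-rest phases."""
    segments, start = [], None
    for i, f in enumerate(flux):
        if f > threshold and start is None:
            start = i
        elif f <= threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(flux)))
    return segments
```

    On a sequence that is at rest, moves, and comes to rest again, this yields a single segment covering the moving transitions.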

    Inverse problem theory in shape and action modeling

    In this thesis we consider shape and action modeling problems from the perspective of inverse problem theory, which provides a mathematical framework for solving model parameter estimation problems. Inverse problems are typically ill-posed, which makes their solution challenging. Regularization theory and Bayesian statistical methods, both developed in the context of inverse problem theory, provide suitable tools for dealing with ill-posed problems. Regarding the application of inverse problem theory to shape and action modeling, we first discuss the problem of saliency prediction, considering a model proposed by the coherence theory of attention. According to coherence theory, salient regions emerge via proto-objects, which we model using harmonic functions (thin membranes). We also discuss the modeling of the 3D scene, as it is fundamental for extracting suitable scene features, which guide the generation of proto-objects. The next application we consider is the problem of image fusion. In this context, we propose a variational image fusion framework based on confidence-driven total variation regularization, and we consider its application to the problem of depth image fusion, an important step in the dense 3D scene reconstruction pipeline. The third problem concerns action modeling, and in particular the recognition of human actions from 3D data. Here, we employ a Bayesian nonparametric model to capture the idiosyncratic motions of the different body parts. Recognition is achieved by comparing the motion behaviors of the subject to a dictionary of behaviors for each action, learned from examples collected from other subjects. Next, we consider the 3D modeling of articulated objects from images taken from the web, with application to the 3D modeling of animals. By decomposing the full object into rigid components and considering different aspects of these components, we move up this hierarchy to obtain a 3D model of the entire object. Single-view 3D modeling, as well as model registration, is performed based on regularization methods. The last problem we consider is the modeling of 3D specular (non-Lambertian) surfaces from a single image. To solve this challenging problem, we propose a Bayesian non-parametric model that estimates the normal field of the surface from its appearance by identifying the material of the surface. After computing an initial model of the surface, we regularize its normal field, also considering a photo-consistency constraint, in order to estimate the final shape. Finally, we conclude the thesis by summarizing the most significant results and by suggesting future directions for the application of inverse problem theory to challenging computer vision problems, such as the ones encountered in this work.

    Point Cloud Structural Parts Extraction based on Segmentation Energy Minimization

    In this work we consider 3D point sets, which in a typical setting represent unorganized point clouds. Segmenting these point sets requires first singling out structural components of the unknown surface discretely approximated by the point cloud. Structural components, in turn, are surface patches approximating unknown parts of elementary geometric structures, such as planes, ellipsoids, and spheres. Our approach is based on level set methods, which compute the moving front of the surface and trace the interfaces between its different parts. Level set methods are widely recognized as among the most effective methods for segmenting both 2D images and 3D medical images, and their use for 3D segmentation has recently received increasing interest. We contribute a novel approach for raw point sets. Based on the motion and distance functions of the level set, we introduce four energy minimization models for segmentation, each built on a distance function specified by geometric features. Finally, we evaluate the proposed algorithm on point sets simulating unorganized point clouds.
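
    The data term of such a segmentation energy can be illustrated with a minimal sketch: each point is assigned to the primitive whose distance function is smallest. All names here are illustrative, and the level set evolution itself is omitted.

```python
import math

def plane_dist(p, normal, offset):
    """Unsigned distance of 3D point p to the plane normal . x + offset = 0."""
    n = math.sqrt(sum(c * c for c in normal))
    return abs(sum(a * b for a, b in zip(p, normal)) + offset) / n

def sphere_dist(p, center, radius):
    """Unsigned distance of 3D point p to a sphere surface."""
    return abs(math.dist(p, center) - radius)

def segment(points, primitives):
    """primitives: list of (name, dist_fn). Assign each point the label that
    minimizes its distance term -- i.e. minimize the energy
    E(label) = sum_i d_{label(i)}(p_i) pointwise."""
    return [min(primitives, key=lambda pr: pr[1](p))[0] for p in points]
```

    A point near the plane z = 0 is labeled "plane", while a point on a small sphere placed at (0, 0, 1) is labeled "sphere".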

    Saliency prediction in the coherence theory of attention

    In the coherence theory of attention, introduced by Rensink, O'Regan, and Clark (2000), a coherence field is defined by a hierarchy of structures supporting the activities taking place across the different stages of visual attention. At the interface between the low-level and mid-level attention processing stages are the proto-objects; these are generated in parallel and collect features of the scene at a specific location and time. These structures fade away if the region is no longer attended. We introduce a method to computationally model these structures. Our model is grounded experimentally in data collected in dynamic 3D environments via the Gaze Machine, a gaze measurement framework. This framework allows recording pupil motion at the required speed and projecting the point of regard into 3D space (Pirri, Pizzoli, & Rudi, 2011; Pizzoli, Rigato, Shabani, & Pirri, 2011). To generate proto-objects, the model is extended to vibrating circular membranes whose initial displacement is generated by the features that have been selected by classification. The energy of the vibrating membranes is used to predict saliency in visual search tasks.
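
    The membrane energy can be illustrated on a discrete grid. This is a toy stand-in for the vibrating-membrane model: it computes only the potential energy of an initial displacement (sum of squared finite differences), not the full modal dynamics of a circular membrane.

```python
def membrane_energy(u0, tension=1.0):
    """Potential energy of a clamped membrane with initial displacement u0
    (a 2D grid of floats): (tension / 2) * sum of squared forward
    differences. Feature-driven displacements raise the energy, which
    serves as a saliency proxy in this sketch."""
    rows, cols = len(u0), len(u0[0])
    e = 0.0
    for i in range(rows):
        for j in range(cols):
            if i + 1 < rows:
                e += (u0[i + 1][j] - u0[i][j]) ** 2
            if j + 1 < cols:
                e += (u0[i][j + 1] - u0[i][j]) ** 2
    return 0.5 * tension * e
```

    A flat patch carries zero energy, while a patch displaced by a feature response carries positive energy and would therefore be predicted as more salient.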

    Rigid tool affordance matching points of regard

    In this abstract we briefly introduce the analysis of simple rigid object affordances by experimentally establishing the relation between subjects' points of regard before grasping an object and the fingertip points of contact once the object is grasped. The analysis shows a strong relation between these data, supporting the hypothesis that people figure out how objects are afforded according to their functionality.

    Bayesian non-parametric inference for manifold based MoCap representation

    We propose a novel approach to human action recognition from motion capture (MoCap) data, based on grouping sub-body parts. Action configurations are represented as manifolds, and joint positions are mapped onto a subspace via principal geodesic analysis. The reduced space is still highly informative and allows for classification based on a non-parametric Bayesian approach, generating behaviors for each sub-body part. Once the set of joints is partitioned, poses relative to a sub-body part are exchangeable given a specified prior, and can in principle elicit infinitely many behaviors. The generation of these behaviors is specified by a Dirichlet process mixture. We show with several experiments that the recognition gives very promising results, outperforming methods that require temporal alignment.
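
    The behavior-generating prior can be illustrated by its sequential view, the Chinese restaurant process: each new pose either joins an existing behavior (with probability proportional to its size) or opens a new one (with mass `alpha`). This sketch samples assignments only; the full Dirichlet process mixture also attaches an observation model to each behavior.

```python
import random

def crp_assignments(n, alpha, seed=0):
    """Sample n cluster ('behavior') assignments from a Chinese restaurant
    process. Exchangeability of the poses is what licenses this sequential
    construction of the Dirichlet process prior."""
    rng = random.Random(seed)
    counts = []   # poses per behavior
    labels = []
    for i in range(n):
        weights = counts + [alpha]     # existing behaviors vs. a new one
        r = rng.uniform(0, i + alpha)  # total mass after i poses
        table, acc = 0, 0.0
        for w in weights:
            acc += w
            if r <= acc:
                break
            table += 1
        table = min(table, len(counts))  # guard floating-point edge cases
        if table == len(counts):
            counts.append(1)             # open a new behavior
        else:
            counts[table] += 1
        labels.append(table)
    return labels
```

    With a moderate `alpha`, the number of distinct behaviors grows roughly logarithmically in the number of poses, far below one behavior per pose.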

    A Pilot Study on Eye-tracking in 3D Search Tasks

    Component-wise modeling of articulated objects

    We introduce a novel framework for modeling articulated objects based on the aspects of their components. By decomposing the object into components, we divide the problem into smaller modeling tasks. After obtaining a 3D model for each component aspect by employing a shape deformation paradigm, we merge the aspect models together to form the object components. The final model is obtained by assembling the components using an optimization scheme which fits the respective 3D models to the corresponding apparent contours in a reference pose. The results suggest that our approach can produce realistic 3D models of articulated objects in reasonable time.
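
    The assembly step can be illustrated with a deliberately reduced version of the fitting problem: aligning a component's projected points to its apparent contour by a closed-form least-squares 2D translation. The actual scheme optimizes full component poses; this sketch only shows the flavor of contour-driven fitting.

```python
def fit_translation(model_pts, contour_pts):
    """Least-squares 2D translation aligning projected component points to
    contour points: with corresponding point sets, the optimum is simply
    the difference of the centroids."""
    n, m = len(model_pts), len(contour_pts)
    mx = sum(p[0] for p in model_pts) / n
    my = sum(p[1] for p in model_pts) / n
    cx = sum(p[0] for p in contour_pts) / m
    cy = sum(p[1] for p in contour_pts) / m
    return (cx - mx, cy - my)
```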